Skip to content

custom domains - auth sync #2180

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 78 commits into
base: master
Choose a base branch
from

Conversation

Soxasora
Copy link
Member

@Soxasora Soxasora commented May 25, 2025

Description

Part of #1942
Focuses on synchronizing authentication between SN and a custom domain.
It features a new sync endpoint that checks if there's a session and if so, redirects to the custom domain with a verification token that gets exchanged with a session token via POST.

Media

Auth Sync Login

Screen.Recording.2025-05-25.at.21.39.07.mp4

Additional Context

The interested code is in /api/auth/sync.js and /middleware.js
Edits to /nav/common has been made to accommodate Next Auth

Flow:

  1. User, on custom domain, goes to /login or /signup
  2. Middleware redirects to /api/auth/sync on SN and checks if there's a session
    ---> yes: Issue a verification token
    ---> no: Redirect to /login or /signup on SN with callback /api/auth/sync so we can resume auth sync
  3. Redirect back to custom domain with the verification token
  4. Middleware checks if there's a token param and make a POST to /api/auth/sync to exchange this token with a session token
  5. Middleware applies the session token to a session cookie

Using a verification token and validating it with a POST is safer than the previous method of exposing a JWT directly

Future TODOs with priority:

  • a page asking user confirmation before proceeding - 3/10
  • pick an SN account to synchronize - 3/10
  • multi auth - 6/10

Checklist

Are your changes backwards compatible? Please answer below:
Yes, only engages on custom domains

On a scale of 1-10 how well and how have you QA'd this change and any features it might affect? Please answer below:
6, Q/A OK, edge cases handled correctly

For frontend changes: Tested on mobile, light and dark mode? Please answer below:
n/a

Did you introduce any new environment variables? If so, call them out explicitly here:
n/a

Soxasora and others added 30 commits May 1, 2025 18:38
- ACM support
- custom domains crud, resolvers, fragments
- custom domains form, guidelines
- custom domains context
- domain verification every 5 minutes via pgboss
- domain validation schema
- basic custom domains middleware, to be completed
- TODOs tracings
- CustomDomain -> Domain
- DomainVerification table
- CNAME, TXT, SSL verification types
- WIP DomainVerification upsert
…ange status of a Record from its Attempt, multi-purpose dns verification
- use DomainVerificationStatus enum for domains and records
- adapt Territory Form UI to new schema
- return 'records' as an object with its types
- wip: prepare for attempts and certificate usage for prisma
fix:
- fix setDomain mutation transaction
- fix schema typedefs

enhance:
- DNS records guidelines with flex-wrap for longer records

cleanup:
- add comments to worker
- remove console.log on validation values
… HOLD

handle territory changes via triggers
- on territory stop, HOLD the domain
- on territory takeover from another user, delete the domain and its associated records

handle ACM certificates via trigger
- on domain/domainCertificate deletion, ask ACM to delete the certificate via a pgboss job; removes the need to ask ACM in multiple places

clear domains that have been on HOLD for more than 30 days, check every midnight via pgboss schedule

use 'domains' profile for worker jobs
@Soxasora Soxasora marked this pull request as ready for review June 2, 2025 15:02
@Soxasora Soxasora marked this pull request as draft June 6, 2025 19:25
- user's csrf cookie is sent as 'state' to the sync endpoint
- 'state' is preserved in the DB, coupling it with the verification token
--- verificationToken|csrfToken
- consuming a verification token via POST requires the csrfToken

- consuming a token also means deleting it from DB, a transaction is used to ensure this
@Soxasora Soxasora force-pushed the custom_domains_authsync branch from 17568c5 to 932f4e3 Compare June 10, 2025 09:15
@Soxasora Soxasora force-pushed the custom_domains_authsync branch from 3335d66 to f25b57a Compare June 10, 2025 23:23
…to compare it with the POST sent CSRF token, to get the final JWT session
@Soxasora Soxasora force-pushed the custom_domains_authsync branch from f25b57a to 1346057 Compare June 10, 2025 23:25
Comment on lines 126 to 127
// store encoded state JWT with the verification token to prevent CSRF attacks
token: `${randomBytes(32).toString('hex')}|${state}`,
Copy link
Member Author

@Soxasora Soxasora Jun 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Too much, switched to token in URL as it shouldn't be logged.

Tie a CSRF token to a verification token There were some obstacles because of the cross-domain nature, but I think I found a middle ground to safely use CSRF to tie the verification token to the user. Knowing that: - messing with next auth is not preferable and thus can't use its protections - we need to protect the token from being stolen - middleware doesn't support node We have another option, which is to **encode** the CSRF token in a _state JWE_ and store it alongside the verification token.

When going to the sync endpoint, we

  • detect login -> encode CSRF token in state JWE -> /api/auth/sync/?...&state=encodedState

The sync endpoint will then

  • create a token like token|encodedState -> redirect back with just the token

Middleware at this point will

  • POST /api/auth/sync with body: {token, csrfToken} -> retrieve the verification token object from DB -> decode the encodedState -> match decodedState with csrfToken

If everything is okay, it returns the final ephemeral JWT.

Everything happens blazingly fast, if someone steals the verification token and/or the state, they can’t go that far without the CSRF token that sits and never moves in the user’s browser (well, except for the POST, but that's secure).


side note: still checking for less hacky ways/can make it less hacky

Soxasora added 4 commits June 13, 2025 12:29
Handler:
- use constants
- better naming
- throw error when a verification token hasn't been found

Middleware:
- POST to auth sync handler WITH timeout
- codebase-conformative fetch composition
- don't decode redirectUri
@Soxasora Soxasora marked this pull request as ready for review June 13, 2025 20:02
return redirectToDomain(res, domain, newVerificationToken.token, redirectUri)
}
} catch (error) {
return res.status(500).json({ status: 'ERROR', reason: 'auth sync broke its legs' })
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the moment, the handler doesn't expose any fatal error, instead it returns this joke reason!
I'm divided by logging the failed, fatal, attempt in DB or just print it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the moment, the handler doesn't expose any fatal error, instead it returns this joke reason!

If this is an error message that a user might see, I'd personally prefer to keep it serious because they might not be in the mood for jokes when they can't login 👀

I'm divided by logging the failed, fatal, attempt in DB or just print it.

I think you can just print it as we do with most errors

Copy link
Member

@ekzyis ekzyis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't really get far with the review today, sorry, will continue the review tomorrow

return redirectToDomain(res, domain, newVerificationToken.token, redirectUri)
}
} catch (error) {
return res.status(500).json({ status: 'ERROR', reason: 'auth sync broke its legs' })
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the moment, the handler doesn't expose any fatal error, instead it returns this joke reason!

If this is an error message that a user might see, I'd personally prefer to keep it serious because they might not be in the mood for jokes when they can't login 👀

I'm divided by logging the failed, fatal, attempt in DB or just print it.

I think you can just print it as we do with most errors

}

// STEP 2: check if domain is valid and ACTIVE
const domainValidation = await checkDomainValidity(domain)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know how important exactly this is, but afaict, you're only checking if the domain parameter is valid; what about the redirect uri?

Copy link
Member Author

@Soxasora Soxasora Jun 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found that a URL with domain and path (redirectUri) separated was nicer to read, so at the end I combine the two after separate validation. Even though the validation for redirectUri is just checking if it starts with /.
...
I'm pushing a regex to only allow conformant paths.

On the domain validation, the goal is to only allow ACTIVE custom domains to create/receive a verification token, which I think it's necessary to limit requests. I also just remembered that the POST is open...
Maybe a safer way to limit external access is to present encrypted params 🤔

Edit: In some of my sketches I used to present everything the endpoint would need under an encrypted state JWT, it worked pretty well!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants